Policy Specification


Learning Logic Specifications for Policy Guidance in POMDPs: an Inductive Logic Programming Approach

Meli, Daniele, Castellini, Alberto, Farinelli, Alessandro

arXiv.org Artificial Intelligence

Partially Observable Markov Decision Processes (POMDPs) are a powerful framework for planning under uncertainty, allowing state uncertainty to be modeled as a belief probability distribution. Approximate solvers based on Monte Carlo sampling have shown great success in relaxing the computational demand and performing online planning. However, scaling to complex realistic domains with many actions and long planning horizons is still a major challenge, and a key to good performance is guiding the action-selection process with domain-dependent policy heuristics tailored to the specific application domain. We propose to learn high-quality heuristics from POMDP traces of execution generated by any solver. We convert the belief-action pairs to a logical semantics and exploit data- and time-efficient Inductive Logic Programming (ILP) to generate interpretable belief-based policy specifications, which are then used as online heuristics. We thoroughly evaluate our methodology on two notoriously challenging POMDP problems involving large action spaces and long planning horizons, namely rocksample and pocman. Considering different state-of-the-art online POMDP solvers, including POMCP, DESPOT and AdaOPS, we show that learned heuristics expressed in Answer Set Programming (ASP) yield performance superior to neural networks and similar to optimal handcrafted task-specific heuristics, at lower computational cost. Moreover, they generalize well to more challenging scenarios not experienced in the training phase (e.g., increasing the number of rocks and the grid size in rocksample, and increasing the map size and the aggressiveness of ghosts in pocman).
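To make the belief-to-logic conversion step concrete, the following is a minimal sketch of how a continuous belief over rock values in rocksample might be discretized into qualitative logical facts suitable as ILP input. The predicate names (`guess`, `uncertain`) and the probability threshold are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch: discretize a POMDP belief into ASP-style logical
# facts, in the spirit of the belief-to-semantics conversion described
# above. Predicate names and thresholds are illustrative only.

def belief_to_predicates(belief, threshold=0.8):
    """Map per-rock 'valuable' probabilities to qualitative facts."""
    facts = []
    for rock, p_valuable in belief.items():
        if p_valuable >= threshold:
            facts.append(f"guess(good,{rock})")
        elif p_valuable <= 1 - threshold:
            facts.append(f"guess(bad,{rock})")
        else:
            facts.append(f"uncertain({rock})")
    return facts

# Example belief over three rocks:
belief = {"r1": 0.95, "r2": 0.10, "r3": 0.55}
print(belief_to_predicates(belief))
# -> ['guess(good,r1)', 'guess(bad,r2)', 'uncertain(r3)']
```

An ILP system could then generalize over many such fact sets paired with the solver's chosen actions to induce rules like "sample a rock only when its guess is good".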


Interpretable Policy Specification and Synthesis through Natural Language and RL

Tambwekar, Pradyumna, Silva, Andrew, Gopalan, Nakul, Gombolay, Matthew

arXiv.org Artificial Intelligence

Policy specification is a process by which a human can initialize a robot's behaviour and, in turn, warm-start policy optimization via Reinforcement Learning (RL). While policy specification/design is inherently a collaborative process, modern methods based on Learning from Demonstration or Deep RL lack the model interpretability and accessibility to be classified as such. Current state-of-the-art methods for policy specification rely on black-box models, which are an insufficient means of collaboration for non-expert users: these models provide no means of inspecting policies learnt by the agent and are not focused on creating a usable modality for teaching robot behaviour. In this paper, we propose a novel machine learning framework that enables humans to 1) specify, through natural language, interpretable policies in the form of easy-to-understand decision trees, 2) leverage these policies to warm-start reinforcement learning, and 3) outperform baselines that lack our natural language initialization mechanism. We train our approach by collecting a first-of-its-kind corpus mapping free-form natural language policy descriptions to decision tree-based policies. We show that our novel framework translates natural language to decision trees with 96% and 97% accuracy on a held-out corpus across two domains, respectively. Finally, we validate that policies initialized with natural language commands significantly outperform relevant baselines (p < 0.001) that do not benefit from our natural language-based warm-start technique.
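As a rough illustration of the kind of interpretable decision-tree policy such a framework could produce from a natural language description, consider the tiny sketch below. The state features, thresholds, and actions are hypothetical placeholders, not drawn from the paper's domains.

```python
# Hypothetical sketch of a human-readable decision-tree policy of the
# kind described above (e.g., derived from "turn left if there is an
# obstacle, otherwise hurry toward the goal"). All feature names,
# thresholds, and actions are illustrative assumptions.

def tree_policy(state):
    """Tiny decision tree mapping state features to a discrete action."""
    if state["obstacle_ahead"]:
        return "turn_left"
    if state["distance_to_goal"] > 5.0:
        return "move_fast"
    return "move_slow"

print(tree_policy({"obstacle_ahead": False, "distance_to_goal": 2.0}))
# -> move_slow
```

Because each branch is explicit, a non-expert can inspect and correct the policy before it is used to warm-start RL, which is the accessibility property the abstract emphasizes.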


Semantic Matching of Security Policies to Support Security Experts

Benammar, Othman, Elasri, Hicham, Sekkaki, Abderrahim

arXiv.org Artificial Intelligence

Management of security policies has become increasingly difficult given the number of domains to manage and their extent and complexity. Security experts have to deal with a variety of frameworks and specification languages used in different domains, which may belong to cloud computing or distributed systems. This wealth of frameworks and languages makes the management and interpretation of security policies difficult. Since each approach provides its own conflict-management method or tool, the security expert is forced to manage all of these tools, which makes maintenance expensive and time-consuming. In order to hide this complexity, facilitate some of the security experts' tasks, and automate the others, we propose an ontology-based security policy alignment process; this process enables the detection and resolution of security policy conflicts and supports security experts in their management tasks.
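To illustrate the general idea of ontology-based alignment across heterogeneous policy vocabularies, here is a minimal sketch: terms from different specification languages are mapped to canonical concepts, after which contradictory rules become detectable. The term mappings, rule format, and resource names are invented for illustration and are not from the paper.

```python
# Hypothetical sketch: align security-policy terms from different
# vocabularies via a shared ontology mapping, then flag conflicts
# where one policy permits and another prohibits the same resource.
# All term names, rules, and resources are illustrative assumptions.

ONTOLOGY = {  # domain-specific term -> canonical concept
    "deny": "prohibit", "forbid": "prohibit",
    "allow": "permit", "grant": "permit",
}

def canonical(rule):
    action, resource = rule
    return (ONTOLOGY.get(action, action), resource)

def find_conflicts(policy_a, policy_b):
    """Return pairs of rules that permit vs. prohibit the same resource."""
    conflicts = []
    for ra in policy_a:
        for rb in policy_b:
            ca, cb = canonical(ra), canonical(rb)
            if ca[1] == cb[1] and {ca[0], cb[0]} == {"permit", "prohibit"}:
                conflicts.append((ra, rb))
    return conflicts

a = [("grant", "db_read")]   # policy written in one vocabulary
b = [("forbid", "db_read")]  # policy written in another vocabulary
print(find_conflicts(a, b))
# -> [(('grant', 'db_read'), ('forbid', 'db_read'))]
```

A real ontology-based approach would of course reason over far richer concept hierarchies than a flat synonym table, but the sketch conveys why a shared semantic layer is what makes cross-framework conflict detection tractable.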